MEGA: Multilingual Evaluation of Generative AI
Generative AI models have shown impressive performance on many Natural
Language Processing tasks such as language understanding, reasoning, and
language generation. An important question being asked by the AI community
today is about the capabilities and limits of these models, and it is clear
that evaluating generative AI is very challenging. Most studies on generative
LLMs have been restricted to English and it is unclear how capable these models
are at understanding and generating text in other languages. We present the
first comprehensive benchmarking of generative LLMs - MEGA, which evaluates
models on standard NLP benchmarks, covering 16 NLP datasets across 70
typologically diverse languages. We compare the performance of generative LLMs
including Chat-GPT and GPT-4 to State of the Art (SOTA) non-autoregressive
models on these tasks to determine how well generative models perform compared
to the previous generation of LLMs. We present a thorough analysis of the
performance of models across languages and tasks and discuss challenges in
improving the performance of generative LLMs on low-resource languages. We
create a framework for evaluating generative LLMs in the multilingual setting
and provide directions for future progress in the field.
Comment: EMNLP 202
End-to-end Privacy Preserving Training and Inference for Air Pollution Forecasting with Data from Rival Fleets
Privacy-preserving machine learning (PPML) promises to train
machine learning (ML) models by combining data spread across
multiple data silos. Theoretically, secure multiparty computation
(MPC) allows multiple data owners to train models on their joint
data without revealing the data to each other. However, the prior
implementations of this secure training using MPC have three limitations: they have only been evaluated on CNNs, and LSTMs have
been ignored; fixed point approximations have affected training
accuracies compared to training in floating point; and due to significant latency overheads of secure training via MPC, its relevance
for practical tasks with streaming data remains unclear.
The motivation of this work is to report our experience of addressing the practical problem of secure training and inference
of models for urban sensing problems, e.g., traffic congestion estimation, or air pollution monitoring in large cities, where data
can be contributed by rival fleet companies while balancing the
privacy-accuracy trade-offs using MPC-based techniques.
Our first contribution is to design a custom ML model for this
task that can be efficiently trained with MPC within a desirable
latency. In particular, we design a GCN-LSTM and securely train
it on time-series sensor data for accurate forecasting, within 7
minutes per epoch. As our second contribution, we build an end-to-end system of private training and inference that provably matches
the training accuracy of cleartext ML training. This work is the first
to securely train a model with LSTM cells. Third, this trained model
is kept secret-shared between the fleet companies and allows clients
to make sensitive queries to this model while carefully handling
potentially invalid queries. Our custom protocols allow clients to
query predictions from privately trained models in milliseconds,
all the while maintaining accuracy and cryptographic security.
‘Beach’ to ‘Bitch’: Inadvertent Unsafe Transcription of Kids’ Content on YouTube
Over the last few years, YouTube Kids has emerged as one of the highly competitive alternatives to television for children's entertainment. Consequently, YouTube Kids' content should receive an additional level of scrutiny to ensure children's safety. While research on detecting offensive or inappropriate content for kids is gaining momentum, little or no current work exists that investigates to what extent AI applications can (accidentally) introduce content that is inappropriate for kids.
In this paper, we present a novel (and troubling) finding that well-known automatic speech recognition (ASR) systems may produce text content highly inappropriate for kids while transcribing YouTube Kids' videos. We dub this phenomenon inappropriate content hallucination. Our analyses suggest that such hallucinations are far from occasional, and the ASR systems often produce them with high confidence. We release a first-of-its-kind data set of audios for which the existing state-of-the-art ASR systems hallucinate inappropriate content for kids. In addition, we demonstrate that some of these errors can be fixed using language models.
Analysis of thermal comfort properties of tri-layer knitted fabrics
Tri-layer knitted fabrics for active sportswear were developed in this study from microdenier filament polyester yarn, spun polyester yarn, polypropylene, and cotton, and were then examined for their thermal comfort properties. The results showed that the Microdenier Polyester/Microdenier Polyester/Cotton tri-layer combination has exceptionally good thermal comfort properties, owing to structural factors such as its filamentous nature, lower thickness, low areal density, and lower bulkiness. The choice of fiber also plays a crucial part in the thermal comfort properties of the tri-layer fabrics. The Microdenier Polyester/Polypropylene/Cotton and Polypropylene/Microdenier Polyester/Cotton samples performed next best after the Microdenier Polyester/Microdenier Polyester/Cotton combination, because polypropylene also has good wicking characteristics. The Microdenier Polyester/Cotton/Polypropylene sample showed poor thermal behavior because of protruding cotton fibers, increased thickness, high areal density, etc. Comparing filament and spun yarn, the filament yarn is highly recommended because of its favorable behavior. Overall, the results show that the Microdenier Polyester/Microdenier Polyester/Cotton combination has exceptionally good thermal comfort properties.
Tidal dynamics and rainfall control N2O and CH4 emissions from a pristine mangrove creek
Dissolved CH4, N2O, O2, and inorganic nitrogen nutrients (NH4+, NO3− and NO2−) were measured over tidal cycles in pristine Wright Myo mangrove creek waters during dry and wet seasons. Dissolved CH4 and N2O showed no seasonality (dry season: 491 ± 133 nmol CH4 l⁻¹, 9.0 ± 2.3 nmol N2O l⁻¹; wet season: 466 ± 94 nmol CH4 l⁻¹, 8.6 ± 1.3 nmol N2O l⁻¹). Creek water dissolved gas and inorganic nitrogen distributions reflect sediment porewater release during hydrostatic pressure drop toward low water. Creek water CH4 emission was suppressed by oxidation during rainfall, consistent with changes to dissolved nitrogen speciation, although N2O emissions were unaffected. Scaling up emissions flux estimates from mangrove creek waters and intertidal sediment gives worldwide mangrove emissions of ~1.3 × 10¹¹ mol CH4 yr⁻¹ and 2.7 × 10⁹ mol N2O yr⁻¹; mangrove ecosystems are thus small contributors to coastal N2O emissions but could dominate coastal CH4 emissions. Comparing our data with mangrove CO2 fluxes, mangrove ecosystems could be small net contributors of atmospheric greenhouse gases.
Serum Ferritin;
The aim of the study is to investigate the levels of hormones, lipids, iron and vitamins in the serum of women with primary infertility. There are many biological causes of infertility, including some that medical intervention can treat. The blood samples collected were analysed for hormones (LH, FSH, prolactin, estradiol), lipids (cholesterol, triglycerides, HDL, VLDL, LDL), iron, haemoglobin, serum ferritin and vitamins (D, E, C). The results were analysed with GraphPad Prism software and are tabulated as follows. The levels of LH, FSH, prolactin and estradiol were found to be increased in the test group compared with the control. The levels of cholesterol, LDL and VLDL were found to be increased in the test group compared with the control. The levels of triglycerides and HDL were found to be decreased in the test group compared with the control. The levels of iron and haemoglobin were found to be decreased in the test group compared with the control. The level of serum ferritin was found to be increased in the test group compared with the control. The levels of vitamins D and C were found to be decreased in the test group compared with the control. The level of vitamin E was found to be increased in the test group compared with the control.